The following content has been provided by the University of Erlangen-Nürnberg.
Today we will consider one topic that is very important for pattern recognition in general,
with many, many applications to many, many problems you might not even have thought of.
And if you attend lectures by Professor Kellermann, for instance (who is attending lectures by Professor Kellermann?),
you might have seen this already because this is also used a lot in separating audio signals.
So if you have a mixture of signals, people talking to each other, many people talking to each other,
and you want to separate the different sources, then you use a technique called independent component analysis.
And that's actually what we are going to study today and tomorrow morning.
And also tomorrow morning I'm going to teach the next lecture, so let's see.
And the core idea basically is we have to solve a system of linear equations that is, in a sense, special,
because not only the source signals but also the entries of the matrix are unknown; only the observations are given.
So we have a product of two numbers, and out of the product we have to compute the two factors.
For those of you who attend diagnostic medical image processing,
this is a problem that you might have seen already in defect pixel interpolation,
or we also discussed problems of this kind in a chapter on removing inhomogeneities in MRI images,
where we also said we observe a product of two numbers and we do the factorization in a way that we have both the original image and the artifact.
What we are going to consider is the so-called cocktail party problem,
which becomes a problem that is harder and harder with age.
If you are getting older and older, usually your ears and your brain are also not separating the different sources properly anymore.
And what we want to do is we want to find algorithms that basically separate the different sources.
So imagine the following situation. Welcome!
You know why I was not teaching last week? I was in Peking.
Did you follow the press last week? There was the highest smog alarm ever, and I have been there. Can you imagine?
It was the first time that I was in Peking, and exactly that day we had the new record.
But it was fun. I will let you know something about that later.
So what type of problem do we want to solve? Think about being in Beijing, in the underground.
A lot of people are talking and you want to separate certain signals.
So you have two microphones in the underground of Peking with different locations,
and the microphones record time signals. We call them X1 and X2.
So these are, let's say, two speakers. One is Ching Feng and the other one is Lin Chao.
I learned Chinese, you see. So Ching Feng and Lin Chao.
One is a male and the other one is a female name, by the way.
I am still working on the trick how to classify female and male names. It's impossible for me.
You're going to teach me, right?
So each recorded signal is a weighted sum of two speakers. So they speak both.
One is more distant, the other one is closer, and they talk, blah, blah, blah, blah, blah,
and I receive a mixture.
So what I record is the recorded signal. Oh, sorry. That's the recorded signal.
Each of X1 and X2 is just a weighted sum of the first speaker and the second speaker, S1 and S2.
So these are the two speakers. So I'm sure you will remember these things.
Oops. I have to practice again. I haven't done this for weeks now.
So this is here Ching Feng and this is Lin Chao. Okay? Good. For the exam, you need to know that.
So we have the two sources. And now the parameters Aij, so these are the mixture components,
depend on the distance of the microphones to the speakers.
Now that's also something that is intuitively clear to us.
And for simplicity, we assume that we have constant weighting factors. They do not change over time.
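To make the mixing model concrete, here is a minimal sketch in Python with NumPy. The two synthetic signals standing in for the speakers, the sample count, and the weight values in A are all made-up assumptions for illustration; none of them come from the lecture. It only shows that each microphone signal is a weighted sum of S1 and S2 with weights that stay constant over time.

```python
import numpy as np

# Hypothetical stand-ins for the two speakers S1 and S2 (not real speech).
t = np.linspace(0.0, 1.0, 500)
s1 = np.sin(2 * np.pi * 5 * t)            # first source, a sine wave
s2 = np.sign(np.sin(2 * np.pi * 3 * t))   # second source, a square wave
S = np.vstack([s1, s2])                   # shape (2, 500): one row per source

# Assumed constant mixing weights Aij (illustrative values only).
# Row i holds the weights with which microphone i picks up each speaker.
A = np.array([[0.8, 0.3],
              [0.4, 0.7]])

# Each microphone records a weighted sum of both speakers:
#   X1 = A11 * S1 + A12 * S2
#   X2 = A21 * S1 + A22 * S2
X = A @ S
x1, x2 = X
```

Writing the two sums as one matrix product is just a compact way of stating the same thing, and it is the form in which the unknown matrix shows up in the problem.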
And basically, if you have to solve this type of problem, you say, okay, if I know the weights,
basically the two by two matrix in this particular case we are considering,
then you can reconstruct the signals by just solving a system of linear equations.
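As a small sketch of this easy case, assume the two by two weight matrix is known and invertible. The random source signals and the weight values below are invented for illustration and are not from the lecture. With A given, recovering the sources is just a linear solve for every time sample.

```python
import numpy as np

# Hypothetical source signals: 2 sources, 1000 time samples each.
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, size=(2, 1000))

# Assumed known, invertible mixing matrix (illustrative values only).
A = np.array([[0.8, 0.3],
              [0.4, 0.7]])

# Observed microphone signals: X = A S.
X = A @ S

# With A known, solve A S_rec = X for all time samples at once.
S_rec = np.linalg.solve(A, X)

# Largest deviation between recovered and true sources (numerical noise only).
max_error = np.max(np.abs(S_rec - S))
```

The point of the sketch is the contrast: once the weights are given, nothing more than standard linear algebra is needed. The hard problem is the one where A is not given.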
So if I give to you the weights here, A11 and A12, and you have the, this is the observed signal,
Access: Open access
Duration: 00:41:03 min
Recording date: 2013-01-21
Uploaded: 2013-01-21 16:43:10
Language: en-US